Abstract: We investigate the problem of optimal request routing
and content caching in a heterogeneous network supporting
in-network content caching with the goal of minimizing average 
content access delay. Here, content can either be accessed directly 
from a back-end server (where content resides permanently) 
or be obtained from one of multiple in-network caches. To 
access a piece of content, a user must decide whether to route 
its request to a cache or to the back-end server. Additionally, 
caches must decide which content to cache. We investigate the
complexity of two problem formulations, where the
direct path to the back-end server is modeled as i) a congestion-sensitive
or ii) a congestion-insensitive path, reflecting whether
or not the delay of the uncached path to the back-end server 
depends on the user request load, respectively. We show that the 
problem is NP-complete in both cases. We prove that under the 
congestion-insensitive model the problem can be solved optimally 
in polynomial time if each piece of content is requested by only 
one user, or when there are at most two caches in the network. We 
also identify a structural property of the user-cache graph that 
potentially makes the problem NP-complete. For the congestion- 
sensitive model, we prove that the problem remains NP-complete 
even if there is only one cache in the network and each content is 
requested by only one user. We show that approximate solutions 
can be found for both models within a $(1 - 1/e)$ factor of
the optimal solution, and demonstrate a greedy algorithm that 
is found to be within 1% of optimal for small problem sizes. 
Through trace-driven simulations we evaluate the performance 
of our greedy algorithms, which show up to a 50% reduction in 
average delay over solutions based on LRU content caching. 

I. Introduction 

In-network content caching has received considerable attention
in recent years as a means to address the explosive growth
in data access seen in today's networks. Its main premise is
to store content at the network's edge, close to the end users,
to reduce user content access delay and network bandwidth
usage. The benefits of in-network content caching have been
demonstrated in the context of CDNs [?] as well as hybrid
networks comprised of cellular and MANETs or femto-cell
networks [?].

In this paper, we investigate a joint problem of in-network 
content caching and request routing in a hybrid network where 
stored content can be accessed through multiple heterogeneous 
network paths. We consider a scenario in which users send 
requests for content that is always available at a remote back-end
server located in the network core but may also be


present at multiple in-network caches. Access to the back-end 
server employs a potentially costly, congested, and/or slower 
uncached path, while the in-network caches may be reached 
through cheaper and faster network paths. This scenario 
arises in various hybrid network contexts where content can 
be accessed through multiple heterogeneous paths, including 
core/edge CDNs, macro/femto cell networks, cellular/MANET 
networks (e.g., where the path to the network core is over 
cellular infrastructure and in-network caches are accessible via 
MANET paths), and cloud/edge cellular networks with edge 
storage at the cellular base stations. If a request is routed to an 
in-network cache that has the requested content, the request 
is served immediately. Otherwise, the cache must download 
the content from the back-end server before serving it to the 
user, incurring additional delay. Additionally, the cache must 
decide whether or not to store the downloaded content. 

We address the following question: how should users route 
their requests among the paths to in-network caches and 
the back-end server, and what in-network cache management 
policy should be adopted to minimize the average content 
access delay across all users? We consider two variants of 
the problem. First, we consider a congestion-insensitive delay
model (termed CI-model), assuming that delays are independent
of the traffic load on all paths. Second, we consider a
congestion-sensitive delay model (termed CS-model), assuming
that the delay to the back-end server (i.e., the uncached
path) depends on the traffic load. In a hybrid cellular/MANET
network, the uncached path in the CI-model corresponds to
GBR (guaranteed bit rate) 3GPP bearer service, while in the
CS-model it corresponds to Non-GBR Aggregate Maximum
Bit Rate (AMBR) bearer service [?].

Our goal in this paper is two-pronged. First, we seek a
principled understanding of the computational complexity of 
the joint caching and routing problem: i) Can the general 
problem be solved optimally in polynomial time? ii) If not, 
are there problem instances that are tractable and which of the 
above modeling aspects make the general problem intractable? 
Second, we seek efficient approximate solutions to the joint 
routing/caching problem, with approximation guarantees, that 
work well in practice. 

Toward our first goal, we provide a unified optimization 


formulation of the joint caching and routing problem for 
both models and show this problem is NP-complete in the 
general case. Then, we investigate which factors contribute 
to the problem complexity. For the CI-model, we prove that
the optimal solution can be found in polynomial time in two
special cases: a) when each user requests a single piece of
content, or b) when there are at most two in-network caches.
We also identify a condition that is potentially the root
cause of the complexity of the problem in the general case:
cycles with an odd number of users and caches in the graph
that represents the network. For the CS-model, we prove that
the problem is “harder”: it remains NP-complete even if there 
is a single cache and each user accesses a single distinct 
content. These results provide valuable insights on the problem 
complexity. 

Toward our second goal, we show that the problem of
optimal joint caching and routing for both the CI-model
and the CS-model can be formulated as maximization of a
monotone submodular function subject to matroid constraints.
This enables us to devise two greedy algorithms. The first one
has a higher complexity but can produce solutions within a
$(1 - 1/e)$ factor of the optimal solution for both the CI-model
and CS-model. The second algorithm has a lower complexity
but does not have known approximation guarantees. We evaluate
the performance of these algorithms through numerical
evaluations and trace-driven simulations on a large dataset of 
approximately 9 million requests for 3 million content items. 
The results show that both algorithms are within 1% of the 
optimal for small problem sizes where computing the optimal 
solution is feasible and that significant reductions (up to 50%) 
in content access delay can be achieved over traditional LRU- 
based content caching schemes. 

Our contributions can be summarized as follows: 

• We provide a unified optimization formulation for the
joint caching and routing problem for the CI-model and
the CS-model and prove that the problem is NP-complete
in both cases.

• We derive insights into problem complexity by considering
several special cases, some of which are shown to
admit efficient solutions, while others remain computationally
hard.

• We develop a greedy caching and routing algorithm that
achieves an average delay within a $(1 - 1/e)$ factor of the
optimal solution and a second greedy algorithm of lower
complexity.

• We evaluate the performance of these algorithms through
numerical evaluations and trace-driven simulations. Numerical
results show that the greedy algorithms perform
close to the optimal solution when computing the optimal
solution is feasible. Our results from trace-driven simulations
show that the greedy algorithms yield significant
performance improvement compared to solutions based
on traditional LRU caching policy.

The paper is organized as follows. The network model
and the joint caching and routing problem formulation are
presented in Sections II and III. Sections IV and V present
the complexity results for the congestion-insensitive and
congestion-sensitive cases, respectively. The approximation
algorithms are presented in Section VI and their performance
is evaluated in Section VII. Section VIII reviews related work,
and Section IX concludes the paper.

Fig. 1: Hybrid network with in-network caching

II. Network Model

In this section, we consider the network shown in Figure 1
with $N$ users generating requests for a set of $K$ unique files
$F = \{f_1, f_2, \ldots, f_K\}$ of unit size. Throughout this paper, we
will use the terms content and file interchangeably. We assume
that these files reside permanently at the back-end server. As
shown in Figure 1, there are $M$ caches in the network that
can serve user requests.

All files are available at the back-end server and users are
directly connected to this server via a cellular infrastructure.
We refer to the cellular path between the user and the back-end
server as the uncached path. Each user can also access a
subset of the $M$ in-network caches where the content might
be cached. We refer to the connection between the user and a
cache as a cached path.

Let $C_m$ denote the storage capacity of the $m$-th cache
measured by the maximum number of files it can store. If
user $i$ requests file $j$ and it is present in the cache, then the
request is served immediately. We refer to this event as a cache
hit. However, if content $j$ is not present in the cache, the cache
forwards the request to the back-end server, downloads
file $j$ from the back-end server and forwards it to the user. We
refer to this event as a cache miss, since it was necessary to
download content from the back-end server in order to satisfy
the request. Note that in case of a cache miss, the cache can
decide whether to keep the downloaded content.

User $i$ generates requests for the files in $F$ according to a
Poisson process of aggregate rate $\lambda_i$. The aggregate request rate
of all users is $\lambda = \sum_i \lambda_i$. We denote by $q_{ij}$ the probability that user $i$
generates a request for file $j$ (referred to as the file popularity).
The popularity of the same file can vary from one user to
another.

Let $A = [a_{im}]$ denote the connections between users and
caches, with $a_{im} = 1$ if user $i$ is connected to cache $m$,
and $a_{im} = 0$ otherwise. For the user-cache connections, let
$d^h_{im}$ and $d^m_{im}$ denote the average delays incurred by user $i$
in the event of a cache hit or miss at cache $m$, respectively.
We assume without loss of generality that $d^h_{im} < d^m_{im}$, i.e.,
cache misses always incur greater delays than cache hits. We
consider two models for the delay over the path from users
to the back-end server. The first is a congestion-insensitive,
constant-delay model where the delays through the uncached
path are independent of the traffic load on the link to the back-end
server. In this case, the average delay experienced for a
request by user $i$ sent over the uncached path is $d^b_i$. The second
model is a congestion-sensitive delay model where delays
experienced over the uncached path depend on the traffic load.
In this case, we assume the back-end server has service rate
$\mu$, and model the connections to the back-end server as an
M/M/1 queue. The delay experienced over the uncached path
then consists of an initial access delay with average $d^b_i$ and a
queuing (waiting plus service) delay with average $1/(\mu - \lambda_b)$,
assuming $\lambda_b$ is the request rate on the queue.


III. Problem Formulation 


In this work, we consider a joint caching and routing
problem with the goal of minimizing average content access
delay over the requests of all users for all files. The solution to
this problem requires addressing two closely related questions:
1) How should cache contents be managed, i.e., which files should
be kept in the caches, and what cache replacement strategy
should be used? and 2) How should users route their requests
between the cached and uncached paths?

For our routing policy, we define the decision variable $p_{ijm}$
that denotes the fraction of the requests of user $i$ for content
$j$ sent to cache $m$. User $i$ sends the remaining $1 - \sum_m p_{ijm}$
fraction of her requests for content $j$ to the back-end server
through the uncached path.

It is shown in [?] for a single cache that given a routing
policy, static caching achieves minimum expected delay. With
static caching, a set of files is stored in the cache, and the cache
content does not change in the event of a cache hit or miss. The
argument in [?] was extended in [?] to a network of caches
to show that static caching achieves minimum delay under
a fixed routing policy. Hence, we define the binary variables
$x_{jm} \in \{0,1\}$ to denote the content placement in caches, where
$x_{jm} = 1$ indicates file $j$ is stored in cache $m$ and $x_{jm} = 0$
indicates otherwise.

We denote by $D(\mathbf{x},\mathbf{p})$ the expected delay obtained by
content placement strategy $\mathbf{x} = [x_{jm}]$ and routing strategy
$\mathbf{p} = [p_{ijm}]$. The optimal solution to the problem of joint
caching and routing is therefore obtained by solving the
following Mixed-Integer Program (MIP):

\[
\begin{aligned}
\text{minimize}\quad & D(\mathbf{x},\mathbf{p}) \\
\text{such that}\quad & \sum_m p_{ijm} \le 1 && \forall i,j \\
& \sum_j x_{jm} \le C_m && \forall m \\
& x_{jm} \in \{0,1\} && \forall j,m \\
& 0 \le p_{ijm} \le a_{im} && \forall i,j,m
\end{aligned}
\tag{1}
\]

In the next two sections, we express the delay function
$D(\mathbf{x},\mathbf{p})$ for the cases of i) congestion-insensitive and ii)
congestion-sensitive uncached path delay models, and discuss
why the joint caching and routing problem is NP-complete.


IV. Congestion-Insensitive Uncached Path 


First, we consider the case where delays on the uncached
path, $d^b_i$, do not depend on the traffic load on the back-end
server. For a given content placement $\mathbf{x}$ and routing policy $\mathbf{p}$,
the average delay can be written as

\[
D(\mathbf{x},\mathbf{p}) = \frac{1}{\lambda}\sum_i\sum_j \lambda_i q_{ij}\Big(\sum_m p_{ijm}x_{jm}d^h_{im}
+ \sum_m p_{ijm}(1 - x_{jm})d^m_{im} + \big(1 - \sum_m p_{ijm}\big)d^b_i\Big).
\tag{2}
\]

Without loss of generality, we assume that
$d^h_{im} < d^b_i < d^m_{im}$ whenever user $i$ is connected to
cache $m$, i.e., $a_{im} = 1$. Note that if $d^b_i > d^m_{im}$, users
connected to cache $m$ will never use the uncached path. Also,
if $d^b_i < d^h_{im}$, none of the users will actually use cache $m$.
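As an illustration, the average-delay expression for the congestion-insensitive model can be evaluated directly for a given placement and routing. The sketch below is only a reading aid: the function name `average_delay_ci` and the nested-list data layout are assumptions made here, not notation from the paper.

```python
def average_delay_ci(lam, q, x, p, d_hit, d_miss, d_back):
    """Average delay of the congestion-insensitive model (illustrative sketch).

    lam[i]        request rate of user i
    q[i][j]       popularity of file j at user i
    x[j][m]       1 if file j is stored in cache m, else 0
    p[i][j][m]    fraction of user i's requests for file j sent to cache m
    d_hit[i][m], d_miss[i][m], d_back[i]
                  hit, miss and uncached-path delays
    """
    total_rate = sum(lam)
    total = 0.0
    for i, lam_i in enumerate(lam):
        for j, q_ij in enumerate(q[i]):
            to_caches = 0.0
            delay = 0.0
            for m, frac in enumerate(p[i][j]):
                # A request sent to cache m sees the hit delay if the file
                # is stored there and the miss delay otherwise.
                per_request = d_hit[i][m] if x[j][m] else d_miss[i][m]
                delay += frac * per_request
                to_caches += frac
            # The remaining traffic uses the uncached path.
            delay += (1.0 - to_caches) * d_back[i]
            total += lam_i * q_ij * delay
    return total / total_rate


# One user, two files, one cache that stores file 0 only.
lam = [2.0]
q = [[0.75, 0.25]]
x = [[1], [0]]
d_hit, d_miss, d_back = [[1.0]], [[9.0]], [5.0]

route_cached_only = [[[1.0], [0.0]]]   # file 0 to the cache, file 1 uncached
route_all_to_cache = [[[1.0], [1.0]]]  # file 1 now suffers cache misses

delay_a = average_delay_ci(lam, q, x, route_cached_only, d_hit, d_miss, d_back)
delay_b = average_delay_ci(lam, q, x, route_all_to_cache, d_hit, d_miss, d_back)
```

In this toy instance, sending requests for the uncached file to the cache raises the average delay, because each such request incurs the miss delay instead of the uncached-path delay.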

It is easy to see that with the congestion-insensitive model,
given a content placement, the minimum average delay is
obtained by routing requests for the cached content to caches,
and routing the remaining requests to the uncached path.
Note that under this routing policy no cache misses occur.
Therefore, the solution to the problem of joint caching and
routing in the case of congestion-insensitive uncached path
delays is obtained by solving the following binary linear
program:

\[
\begin{aligned}
\text{minimize}\quad & \frac{1}{\lambda}\sum_i\sum_j \lambda_i q_{ij}\Big[\sum_m p_{ijm}d^h_{im} + \big(1 - \sum_m p_{ijm}\big)d^b_i\Big] \\
\text{such that}\quad & \sum_m p_{ijm} \le 1 \\
& \sum_j x_{jm} \le C_m \\
& x_{jm} \in \{0,1\} \\
& p_{ijm} \le x_{jm} \cdot a_{im} \\
& p_{ijm} \ge 0.
\end{aligned}
\tag{3}
\]


Note that $D(\mathbf{x},\mathbf{p})$ is a linear function of the routing variables.
Also note the additional constraint $p_{ijm} \le x_{jm} \cdot a_{im}$, which
is due to the fact that only requests for cached content are
routed to caches. Since $d^h_{im} < d^b_i$ and $d^b_i < d^m_{im}$, users have
no incentive to split the traffic for any content between the
cached and uncached paths, and hence there will be no routing
variable $p_{ijm}$ with a fractional value in the optimal solution,
i.e., $p_{ijm} \in \{0,1\}$.

A. Hardness of General Case 

The above formulation of the joint caching and routing
problem is a generalization of the Helper Decision Problem
(HDP) proved to be NP-complete in [10]. Our formulation is
more general as we consider non-homogeneous delays for the
cached and uncached paths. HDP reduces to the optimization
problem in (3) by setting $d^b_i = 1$, $d^h_{im} = 0$, and $C_m = C$,
where $C$ is the cache size at all caches in HDP.

Although the problem is NP-complete in general, we will 
show that the joint caching and routing problem can be solved 
in polynomial time for several special cases, and discuss what 
makes the problem “hard” in general. We first consider a 
restrictive setting where each user is interested in only one file 
and each file is requested by only one user. Next, we consider a 
network with two caches (but each user may be interested in an 
arbitrary number of files). We present polynomial time solution 
algorithms for both cases. Finally, we present an example 
that demonstrates what we conjecture to be the source of the 
complexity of this problem. 

B. Special Case: One File per User 

Consider the network illustrated in Figure 1, but assume
each user is interested in only one file, i.e., $q_{ii} = 1$, and $q_{ij} = 0$
for $i \ne j$. In this case, the optimal solution to the joint caching
and routing problem can be found in polynomial time based
on a solution to the maximum weighted matching problem.

Note that in this case, the number of files equals the number
of users, i.e., $N = K$. To avoid triviality, we assume that the
number of users is larger than the capacity of each cache in
the network, i.e., $C_m < N, \forall m$. The assumption that each user
is interested in only one file allows us to re-write the objective
function in (3) as

\[
D(\mathbf{x},\mathbf{p}) = \frac{1}{\lambda}\sum_i\Big[\sum_m \lambda_i p_{iim}d^h_{im} + \lambda_i\big(1 - \sum_m p_{iim}\big)d^b_i\Big]
= \frac{1}{\lambda}\Big(\sum_{i=1}^{N}\lambda_i d^b_i - \sum_i\sum_m \lambda_i p_{iim}\,(d^b_i - d^h_{im})\Big).
\]

Since $\sum_i \lambda_i d^b_i$ is a constant independent of the decision
variables, minimizing the above objective function is equivalent
to maximizing $\sum_i\sum_m \lambda_i p_{iim}(d^b_i - d^h_{im})$. Note that
$\lambda_i(d^b_i - d^h_{im})$ can be interpreted as the gain obtained by having
file $i$ in cache $m$. This problem can then be naturally seen as
matching files to caches with the goal of maximizing the sum
of individual gains. In what follows, we map this problem to
the maximum weighted matching problem.

For each cache of size $C_m$, we introduce $C_m$ nodes
$v_m^1, v_m^2, \ldots, v_m^{C_m}$ representing unit-size micro-caches that
form cache $m$. Let $V = \{v_1^1, v_1^2, \ldots, v_M^{C_M}\}$
denote the set of all such nodes, and let
$U = \{u_1, u_2, \ldots, u_N\}$ denote the set of all files. We
define the bipartite graph $G(U, V, E)$ with $\lambda_i(d^b_i - d^h_{im})$
as the weight of the edges connecting node $u_i$ to nodes
$v_m^s$, $\forall s \in \{1, 2, \ldots, C_m\}$. Figure 2 demonstrates a bipartite
graph with user/file nodes $u$ and the micro-cache nodes $v$
with the edge weights shown for some of the edges. Note
that the bipartite graph consists of $|U| + |V| = N + \sum_m C_m$
vertices and $|E| = O(N \sum_m C_m)$ edges.

The optimal solution to the joint content placement and rout¬ 
ing problem corresponds to the maximum weighted matching 
for graph G. The edges selected in the maximum matching 
determine what content should be placed in which cache. 
Users then route to caches for cached content, and to the 
uncached path for the remaining files. 

The maximum weighted matching problem for bipartite
graphs can be solved in $O(|V|^2|E|)$ using the Hungarian
algorithm [11]. In our context, the complexity is $O(M^3N^4)$.
Note that $\sum_m C_m = O(MN)$ as we assumed $C_m < N, \forall m$.
Therefore, we can solve the joint caching and routing problem
in polynomial time when users are interested in one file only.
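The micro-cache construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`hungarian_min_cost`, `place_one_file_per_user`), the zero-gain dummy columns that let a file remain uncached, and the toy numbers are all introduced here. The solver is a standard potentials-based Hungarian routine for rectangular cost matrices.

```python
def hungarian_min_cost(cost):
    """Minimum-cost assignment of every row to a distinct column (rows <= cols).

    Returns row_to_col, the column chosen for each row.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-based)
    v = [0.0] * (m + 1)          # column potentials (1-based)
    p = [0] * (m + 1)            # p[j]: row currently matched to column j
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    row_to_col = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            row_to_col[p[j] - 1] = j - 1
    return row_to_col


def place_one_file_per_user(lam, d_hit, d_back, connected, capacity):
    """User i requests file i only. Returns ({file: cache}, total gain)."""
    n_users = len(lam)
    # One column per unit-size micro-cache, remembering its parent cache.
    micro = [m for m, c in enumerate(capacity) for _ in range(c)]
    gain = [[lam[i] * (d_back[i] - d_hit[i][m]) if connected[i][m] else 0.0
             for m in micro] for i in range(n_users)]
    # Assumption of this sketch: zero-gain dummy columns model "not cached".
    weights = [row + [0.0] * n_users for row in gain]
    w_max = max(max(row) for row in weights)
    cost = [[w_max - w for w in row] for row in weights]  # max gain -> min cost
    row_to_col = hungarian_min_cost(cost)
    placement, total = {}, 0.0
    for i, col in enumerate(row_to_col):
        if col < len(micro) and gain[i][col] > 0:
            placement[i] = micro[col]
            total += gain[i][col]
    return placement, total


# Toy instance: three users/files, two caches of capacity one.
placement, total_gain = place_one_file_per_user(
    lam=[1, 2, 1],
    d_hit=[[2, 4], [3, 3], [9, 1]],
    d_back=[10, 10, 10],
    connected=[[1, 1], [1, 1], [1, 1]],
    capacity=[1, 1],
)
```

For this instance the matching stores file 1 in cache 0 and file 2 in cache 1, and user 0 falls back to the uncached path.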




Fig. 2: Modeling content placement as a maximum weighted matching 
problem 

C. Special Case: Network with Two Caches 

Next, we show that the optimal solution for the joint caching 
and routing problem can be found in polynomial time when 
there are only two caches in the network. Specifically, we
prove that the solution to the integer program (3) can be found
in polynomial time when there are two caches in the network.
In the remainder of this section, we assume that the variables $x_{jm}$ take real
values.

Before delving into the proof we introduce some definitions
and results from [12]:

Definition 1. A square integer matrix is called unimodular if
it has determinant $+1$ or $-1$.

Definition 2. An $m \times n$ integral matrix $A$ is totally unimodular
if the determinant of every square submatrix is $0$, $1$, or $-1$.

Proposition 1. If for a linear program $\{\max c^T x : Ax \le b\}$,
$A$ is totally unimodular and $b$ is integral, then there is an
optimal solution to the linear program that is integral.
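To make Definition 2 concrete, the following sketch tests total unimodularity by brute force, enumerating every square submatrix. It is exponential in the matrix size and meant only for tiny examples. The two test matrices are illustrative choices made here (a consecutive-ones matrix and the incidence matrix of a triangle), not the constraint matrix of the caching problem.

```python
from itertools import combinations


def det(mat):
    """Exact integer determinant by cofactor expansion (tiny matrices only)."""
    n = len(mat)
    if n == 1:
        return mat[0][0]
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in mat[1:]]
        total += (-1) ** c * mat[0][c] * det(minor)
    return total


def is_totally_unimodular(a):
    """Check that every square submatrix has determinant in {-1, 0, 1}."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


consecutive_ones = [[1, 1, 0],
                    [0, 1, 1]]
triangle = [[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]]   # odd cycle: the full determinant equals 2

tu_consecutive = is_totally_unimodular(consecutive_ones)
tu_triangle = is_totally_unimodular(triangle)
```

The triangle matrix fails the test because its determinant is 2, a small-scale analogue of how odd cycles can break total unimodularity.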
















Fig. 3: A network with three users and three caches. Each user is in the 
communication range of two of the caches. 



Fig. 4: (a) A network of three users connected to three caches forming a cycle,
(b) Optimal content placement according to binary placement decisions, i.e.,
$x_{jm} \in \{0,1\}$. (c) Optimal content placement assuming fractions of files can
be stored in caches, i.e., $0 \le x_{jm} \le 1$. (d) Optimal content placement with
the possibility of content coding.


Note that the three sets of constraints in the optimization
problem in (3), namely, i) $\sum_m p_{ijm} \le 1$, ii) $\sum_j x_{jm} \le C_m$,
and iii) $p_{ijm} - x_{jm} \cdot a_{im} \le 0$, can be written in the form
$Az \le b$ where the entries of $A$ and $b$ are all integers, and
$z$ consists of the $x_{jm}$ and $p_{ijm}$ entries. From Proposition 1,
then, it suffices to show that the matrix $A$ is totally unimodular
for a network with two caches to prove that optimization (3)
can be solved in polynomial time. To prove that the matrix $A$
is totally unimodular we use the following result from [13]:


Proposition 2. A matrix is totally unimodular if and only if
for every subset $R$ of rows, there is an assignment $s : R \to \pm 1$
of signs to rows so that the signed sum $\sum_{r \in R} s(r)\, r$ (which
is a row vector of the same width as the matrix) has all its
entries in $\{0, \pm 1\}$.


In the Appendix, we give a constructive proof showing that
for any subset $R$ of rows of $A$ we can find an assignment $s$
that satisfies Proposition 2.


D. Complexity Discussion 

Although the problem of joint caching and routing is NP- 
complete in general, in the previous subsections we showed 
that several non-trivial special cases of the problem can be 


solved in polynomial time. In this section, we discuss the 
potential cause of complexity of the problem. 

By relaxing the integer constraints on content placement
variables $x_{jm}$, and allowing them to take real values, i.e.,
$0 \le x_{jm} \le 1$, we obtain another problem that is generally
referred to as the "relaxation" of problem (3). Since the
objective function in (3) is convex, the solution to the relaxed
problem can be found in polynomial time for all instances of
the problem.

By comparing the solutions to the integer and the relaxed
problems for a large number of instances of the optimization
problem in (3), we discuss what we believe is the root cause of
the complexity of the joint caching and routing problem in the
case of a congestion-insensitive uncached path. We observe via
numerical evaluations that for most instances of the problem,
solutions to problem (3) match those of the relaxed problem.
Those instances that result in different solutions to the two
problems exhibit a certain structure that we explain here.

Consider a network with three users and three caches as
depicted in Figure 3. With each user connected to two of the
caches, the user-cache connections can be seen to form a cycle
as demonstrated in Figure 4(a). Assume all paths from users to
caches have equal hit and miss delays. Also, assume that each
cache has the capacity of storing one file, and that all three
users are interested in two files, noted here as green and red.

Solving the optimization problem (3) for the above network,
the optimal content placement is to replicate one of the
files in two of the caches, and have one copy of the other
file in the third cache, as shown in Figure 4(b). With this
content placement, two of the users get both files from their
cached paths, and the third user gets only one file from
cached paths, and has to use the uncached path to get the
other file. The solution to the relaxed optimization problem,
however, would be to store half of each file in each cache, i.e.,
$x_{1m} = x_{2m} = 0.5$, which achieves strictly smaller average
delay. This solution is illustrated in Figure 4(c).
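The gap between the integer and relaxed solutions in this three-user, three-cache cycle can be checked by brute force. The sketch below is an illustration under simplifying assumptions made here: all users and files are weighted equally, so it counts how much (user, file) demand is served from caches rather than computing delays, and the index sets are hypothetical labels.

```python
from itertools import product

# Each user is connected to two of the three caches, forming a cycle.
USER_CACHES = {0: (0, 1), 1: (1, 2), 2: (2, 0)}
FILES = (0, 1)  # the "green" and "red" files


def served_pairs_integer(stored):
    """stored[m] is the single file held by cache m (capacity one)."""
    return sum(1 for caches in USER_CACHES.values() for f in FILES
               if any(stored[m] == f for m in caches))


# Integer problem: try every way of putting one whole file in each cache.
best_integer = max(served_pairs_integer(s) for s in product(FILES, repeat=3))

# Relaxed solution described in the text: half of each file in every cache,
# with every request split evenly over the user's two caches.
x = {(f, m): 0.5 for f in FILES for m in range(3)}
p = {(i, f, m): 0.5 for i, caches in USER_CACHES.items()
     for f in FILES for m in caches}

capacity_ok = all(sum(x[f, m] for f in FILES) <= 1 for m in range(3))
coupling_ok = all(p[i, f, m] <= x[f, m] for (i, f, m) in p)
split_ok = all(sum(p[i, f, m] for m in USER_CACHES[i]) <= 1
               for i in USER_CACHES for f in FILES)
served_relaxed = sum(p.values())
```

The best integer placement serves five of the six (user, file) pairs from caches, whereas the fractional point is feasible for the relaxed constraints and accounts for all six.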

The above discussion shows how the solution to optimization
problem (3) differs from its relaxed counterpart for the
network shown in Figure 3. It can easily be seen that matrix
$A$ corresponding to the linear constraints $Az \le b$ for this
network is not totally unimodular.

Such a mismatch between the solutions of the optimization
problem (3) and the corresponding relaxed problem is also
observed for larger networks that contain an odd number of users
and an odd number of caches connected in a way that forms a
cycle. It is easy to show for all these networks that the matrix
$A$ of the constraints is not totally unimodular.

We conjecture that these cycles are the source of complexity
in the problem of joint caching and routing, and for networks
that do not have any such cycles the solution to the optimization
problem (3) matches that of the relaxed problem. More
specifically, we have the following conjecture:

1 Note that we do not consider the solution of the relaxed problem as a
legitimate content placement. Although it looks like all users can access the
two files via the caches in Figure 4(c), when splitting the files in halves, two
of the caches will store the same half copy of a file, and the user connected
to those caches will only get half of that file from the caches and still needs
to use the uncached path for the other half. However, we acknowledge that
with the possibility of coding, content placement can be done in such a way
that users can get both files from caches, as is shown in Figure 4(d). We are
not considering coded content placement in this work.

Conjecture 1. The optimal solution to the problem of joint
caching and routing can be found in polynomial time if there
are no cycles of length $4k + 2$, $k \ge 1$, in the bipartite graph
corresponding to the user-cache connections.


V. Congestion-Sensitive Uncached Path 


Next, we consider the case where delays on the uncached
path depend on the traffic load on the back-end server. We
model the uncached path as an M/M/1 queue with service
rate $\mu$. In addition to the queuing delay, we assume that user $i$
observes an initial access delay to the uncached path with average
$d^b_i$, $i = 1, \ldots, N$. Here, we make no assumptions regarding
the relative order of $d^h_{im}$ and $d^b_i$. Note that if $d^h_{im} < d^b_i$ and the needed object is in
the cache, user $i$ will direct all her requests for that object to the
cache. If $d^h_{im} > d^b_i$, however, even if the needed content is
in cache $m$, the user may prefer to use the uncached path,
depending on the service rate and the load on the back-end
server. For a given content placement $\mathbf{x}$ and routing policy $\mathbf{p}$,
the average delay can be written as

\[
D(\mathbf{x},\mathbf{p}) = \frac{1}{\lambda}\Bigg[\sum_i\sum_j \lambda_i q_{ij}\Big(\sum_m x_{jm}p_{ijm}d^h_{im}
+ \sum_m (1 - x_{jm})p_{ijm}d^m_{im} + \big(1 - \sum_m p_{ijm}\big)d^b_i\Big)
+ \frac{\sum_i\sum_j \lambda_i q_{ij}\big(1 - \sum_m p_{ijm}\big)}{\mu - \sum_i\sum_j \lambda_i q_{ij}\big(1 - \sum_m p_{ijm}\big)}\Bigg].
\tag{4}
\]


A. Hardness of General Case 


Note that we can consider the congestion-insensitive delay 
model as a special case of the congestion-sensitive model 
where $\mu = +\infty$. This explains why this problem is NP-
complete in general. In the remainder of this section, however,
we will prove that the problem of joint caching and routing 
in the case of a congestion-sensitive delay model remains NP- 
complete even if there is only one cache in the network and 
each content is of interest to no more than one user. 


B. Hardness of Single-Cache Case 

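Modifying the delay function $D(\mathbf{x},\mathbf{p})$ in (4) for the case of
one cache, i.e., $M = 1$, and assuming each user is interested in
only one file, i.e., $q_{ii} = 1, \forall i$, we can re-write the optimization
problem as

\[
\begin{aligned}
\text{minimize}\quad & \frac{1}{\lambda}\Bigg[\sum_{i=1}^{N}\Big(\lambda_i x_i p_i d^h_i + \lambda_i(1 - x_i)p_i d^m_i + \lambda_i(1 - p_i)d^b_i\Big)
+ \frac{\sum_{i=1}^{N}\lambda_i(1 - p_i)}{\mu - \sum_{i=1}^{N}\lambda_i(1 - p_i)}\Bigg] \\
\text{such that}\quad & \sum_{i=1}^{N} x_i \le C \\
& 0 \le p_i \le a_i \\
& x_i \in \{0,1\},
\end{aligned}
\tag{5}
\]

where $p_i = p_{ii1}$ denotes the fraction of user $i$'s requests that are
sent to the cache (the remaining $1 - p_i$ use the uncached path).
Also, $a_i$ denotes whether user $i$ is connected
to the cache.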
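The single-cache objective can be evaluated numerically to see how congestion couples the users' routing decisions. The sketch below is an illustration, not the authors' code: the function name `cs_single_cache_delay` is hypothetical, and it assumes that only traffic sent directly over the uncached path loads the M/M/1 queue, with the aggregate queueing term written as (load)/(mu - load).

```python
def cs_single_cache_delay(lam, x, p, d_hit, d_miss, d_back, mu):
    """Average delay with one cache, one file per user, M/M/1 back end.

    x[i]: 1 if user i's file is cached, else 0.
    p[i]: fraction of user i's requests sent to the cache; the remaining
          1 - p[i] is sent over the uncached path.
    """
    total_rate = sum(lam)
    backend_rate = sum(l * (1 - pi) for l, pi in zip(lam, p))
    if backend_rate >= mu:
        return float("inf")  # the back-end queue would be unstable
    fixed = 0.0
    for l, xi, pi, dh, dm, db in zip(lam, x, p, d_hit, d_miss, d_back):
        # Cache traffic sees the hit or miss delay; the rest sees the
        # initial access delay of the uncached path.
        fixed += l * pi * (dh if xi else dm) + l * (1 - pi) * db
    queueing = backend_rate / (mu - backend_rate)
    return (fixed + queueing) / total_rate


lam = [1.0, 1.0]
x = [1, 0]                       # only user 0's file is cached
d_hit, d_miss, d_back = [1.0, 1.0], [6.0, 6.0], [2.0, 2.0]

delay_split = cs_single_cache_delay(lam, x, [1.0, 0.0], d_hit, d_miss, d_back, mu=4.0)
delay_backend = cs_single_cache_delay(lam, x, [0.0, 0.0], d_hit, d_miss, d_back, mu=4.0)
delay_overload = cs_single_cache_delay(lam, x, [0.0, 0.0], d_hit, d_miss, d_back, mu=1.5)
```

Unlike the congestion-insensitive case, moving user 0 onto the uncached path raises the delay seen by user 1 as well, because both share the back-end queue.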

To show that the above optimization problem is NP-
complete, we consider the corresponding decision problem,
the Congestion Sensitive Delay Decision Problem (CSDDP).


Problem 1. (Congestion Sensitive Delay Decision Problem)
Let $\boldsymbol{\lambda} = [\lambda_1, \lambda_2, \ldots, \lambda_N]$ denote the request rates of users,
and let $\mathbf{d}^h = [d^h_i]$, $\mathbf{d}^m = [d^m_i]$ and $\mathbf{d}^b = [d^b_i]$ denote
the hit delay, miss delay and initial delay of the uncached path,
respectively. Also, let $\mu$ be the service rate of the back-end
server, and $C$ be the cache capacity.

We are asking the following question: given the parameters
$(\mu, \boldsymbol{\lambda}, \mathbf{d}^h, \mathbf{d}^m, \mathbf{d}^b, C)$ and a real number $d$, is there any
assignment of $\mathbf{x} = [x_i]$ and $\mathbf{p} = [p_i]$ such that $D(\mathbf{x},\mathbf{p}) \le d$?


It is clear that for any given content placement x and routing 
policy p the answer to CSDDP can be verified in polynomial 
time, and hence CSDDP is in class NP. To prove that CSDDP 
is NP-hard, we use the fact that the following problem is NP- 
hard. 


Problem 2. (Equal Cardinality Partition) Given a set $A$ of $n$
numbers, can $A$ be partitioned into two disjoint subsets $A_1$
and $A_2$ such that $A = A_1 \cup A_2$, the sum of the numbers in $A_1$
equals the sum of the numbers in $A_2$, and $|A_1| = |A_2|$?

Lemma 1. ECP is NP-hard. 


Proof. A proof of NP-hardness of a more general form of ECP
is given in [?]. Here, we give a simpler proof by a reduction
from the Partition problem.

Problem 3. (Partition) Given a set $A$ of $n$ positive integers,
can $A$ be partitioned into two disjoint subsets $A_1$ and $A_2$ such
that $A = A_1 \cup A_2$ and the sum of the numbers in $A_1$ equals
the sum of the numbers in $A_2$?

For each instance of Partition with input $A = \{a_1, \ldots, a_n\}$,
create an instance $A' = \{a_1, \ldots, a_n, 0, \ldots, 0\}$ by adding $n$
zeros to $A$. It is easy to see that $A'$ can be partitioned into
two subsets with equal sums and equal cardinality if and only if $A$ can be
partitioned. Therefore, Partition $\le_p$ ECP, and ECP is NP-hard.

□
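The padding step of this reduction is easy to check by brute force on small instances. The sketch below is illustrative only: the function names are made up here, and both deciders enumerate subsets, so they are exponential and suitable for toy inputs.

```python
from itertools import combinations


def has_partition(nums):
    """Brute force: can nums be split into two parts with equal sums?"""
    total = sum(nums)
    if total % 2:
        return False
    idx = range(len(nums))
    return any(sum(nums[i] for i in c) * 2 == total
               for k in range(len(nums) + 1) for c in combinations(idx, k))


def has_equal_cardinality_partition(nums):
    """Brute force: equal-sum split whose two parts also have equal size."""
    total, n = sum(nums), len(nums)
    if total % 2 or n % 2:
        return False
    return any(sum(nums[i] for i in c) * 2 == total
               for c in combinations(range(n), n // 2))


def partition_to_ecp(nums):
    """Reduction from the proof: append len(nums) zeros to the instance."""
    return list(nums) + [0] * len(nums)


instances = [[1, 2, 3], [1, 2, 4], [3, 1, 1, 2, 2, 1], [2, 2, 2]]
agreement = [has_partition(a) == has_equal_cardinality_partition(partition_to_ecp(a))
             for a in instances]
```

The zeros let the smaller side of any equal-sum split be padded up to size $n$, which is why the answers agree on every instance; without padding, `[1, 2, 3]` has an equal-sum split but no equal-cardinality one.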


Lemma 2. CSDDP is NP-Complete. 








Proof. See Appendix for a detailed proof. □ 

Although this problem is NP-complete even in a very 
restrictive case with one cache and each user requesting one 
file, in the next section we show that a greedy algorithm can 
find approximate solutions with guaranteed performance. 

VI. Approximation Algorithms 

In this section, we show that the problem of joint caching
and routing (for both congestion-insensitive and congestion-sensitive
delay models) can be formulated as the maximization
of a monotone submodular function subject to matroid constraints.
This enables us to devise algorithms with provable
approximation guarantees.

We first review the definition and properties of matroids [?],
and monotone [?] and submodular [?] functions,
and then prove our problem can be formulated as the
maximization of a monotone submodular function subject to
matroid constraints.

Definition 3. A matroid $M$ is a pair $M = (S, \mathcal{I})$, where $S$ is
a finite set and $\mathcal{I} \subseteq 2^S$ is a family of subsets of $S$ with the
following properties:

1) $\emptyset \in \mathcal{I}$,

2) $\mathcal{I}$ is downward closed, i.e., if $Y \in \mathcal{I}$ and $X \subseteq Y$, then
$X \in \mathcal{I}$,

3) If $X, Y \in \mathcal{I}$ and $|X| < |Y|$, then $\exists y \in Y \setminus X$ such that
$X \cup \{y\} \in \mathcal{I}$.

Definition 4. Let $S$ be a finite set. A set function $f : 2^S \to \mathbb{R}$
is submodular if for every $X, Y \subseteq S$ with $X \subseteq Y$ and every
$x \in S \setminus Y$ we have

\[
f(X \cup \{x\}) - f(X) \ge f(Y \cup \{x\}) - f(Y).
\]

Definition 5. A set function $f$ is monotone increasing if $X \subseteq Y$
implies that $f(X) \le f(Y)$.
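Definitions 4 and 5 can be tested exhaustively on small ground sets. The following sketch does so; the coverage function and the squared-cardinality function are standard textbook examples chosen here for illustration and are not taken from the paper.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)


def is_monotone(f, ground):
    """X subset of Y implies f(X) <= f(Y), checked over all pairs."""
    return all(f(x) <= f(y) for y in subsets(ground) for x in subsets(y))


def is_submodular(f, ground):
    """Diminishing returns: the gain of adding e shrinks as the set grows."""
    ground = frozenset(ground)
    for y in subsets(ground):
        for x in subsets(y):
            for e in ground - y:
                if f(x | {e}) - f(x) < f(y | {e}) - f(y):
                    return False
    return True


COVERS = {"a": {1, 2}, "b": {2, 3}, "c": {3}}


def coverage(selected):
    """Number of distinct items covered by the selected sets."""
    covered = set()
    for s in selected:
        covered |= COVERS[s]
    return len(covered)


def squared_size(selected):
    return len(selected) ** 2


coverage_is_monotone = is_monotone(coverage, COVERS)
coverage_is_submodular = is_submodular(coverage, COVERS)
squared_is_submodular = is_submodular(squared_size, COVERS)
```

Coverage passes both checks, while the squared cardinality is monotone but has increasing marginal gains, so it fails the submodularity test.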

Let $X_m$ denote the set of files stored in cache $m$, and define
$X = X_1 \cup X_2 \cup \ldots \cup X_M$ to be the set of files stored in
the $M$ caches. $X$ is the set equivalent of the binary content
placement $\mathbf{x}$ defined in (1). Note that $|X_m| \le C_m$.

Let $S_m = \{s_{1m}, s_{2m}, \ldots, s_{Km}\}$ denote the set of all
possible files that could be placed in cache $m$, where $s_{jm}$
denotes the storage of file $j$ in cache $m$. The set element
$s_{jm}$ corresponds to the binary variable $x_{jm}$ defined in the
optimization problem (1) such that $x_{jm} = 1$ if and only if the
element $s_{jm} \in X$. Define the super set $S = S_1 \cup S_2 \cup \ldots \cup S_M$
as the set of all possible content placements in the $M$ caches.
We have the following lemma.

Lemma 3. The constraints in ([T]) form a matroid on S. 

Proof. For a given content placement $\mathbf{x}$, the optimal routing policy can be computed in polynomial time since $D(\mathbf{p}) = D(\mathbf{p}; \mathbf{x})$ is a convex function. With that in mind, we can write the average delay as a function of the content placement $X \subseteq S$. Thus, the constraints on the capacities of the caches can be expressed as $X \in \mathcal{I}$, where

$$\mathcal{I} = \{X \subseteq S : |X \cap S_m| \le C_m,\ \forall m = 1, \ldots, M\}.$$

Note that $(S, \mathcal{I})$ defines a matroid. □
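The capacity constraints above have the structure usually called a partition matroid. As a sanity check, the three properties of Definition 3 can be verified exhaustively on a tiny instance. The sketch below is an illustration only; the encoding of $s_{jm}$ as a `(file, cache)` tuple and all function names are assumptions made here.

```python
from itertools import combinations


def make_ground_set(num_files, num_caches):
    """All placement elements s_jm, encoded as (file j, cache m) tuples."""
    return [(j, m) for m in range(num_caches) for j in range(num_files)]


def is_independent(placement, capacities):
    """True if no cache m holds more than capacities[m] files."""
    counts = [0] * len(capacities)
    for _, m in placement:
        counts[m] += 1
    return all(c <= cap for c, cap in zip(counts, capacities))


def satisfies_matroid_axioms(ground, capacities):
    """Brute-force check of the three properties in Definition 3."""
    subsets = [
        frozenset(combo)
        for size in range(len(ground) + 1)
        for combo in combinations(ground, size)
    ]
    independent = {s for s in subsets if is_independent(s, capacities)}
    # Property 1: the empty set is independent.
    if frozenset() not in independent:
        return False
    # Property 2: removing any one element keeps a set independent
    # (applied repeatedly, this gives downward closure).
    for y in independent:
        if any(y - {e} not in independent for e in y):
            return False
    # Property 3: the exchange property.
    for x in independent:
        for y in independent:
            if len(x) < len(y) and not any(
                (x | {e}) in independent for e in y - x
            ):
                return False
    return True


ground = make_ground_set(num_files=2, num_caches=2)
capacities = [1, 1]
```

With two files, two caches, and unit capacities, the ground set has four elements, so all 16 subsets are enumerated.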


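Let $d_{ij}(\mathbf{x})$ denote the minimum average delay for user $i$ accessing file $j$ through a cached path, given content placement $\mathbf{x}$. We have

$$d_{ij}(\mathbf{x}) = \min_m d_{ijm},$$

where $d_{ijm}$ denotes the average delay of accessing content $j$ from cache $m$, defined as ($x_{jm} = 1$ indicates that file $j$ is in cache $m$)

$$d_{ijm} = d^h_{im} x_{jm} + d^m_{im} (1 - x_{jm}),$$

with $d^h_{im}$ and $d^m_{im}$ the hit and miss delays of user $i$ at cache $m$.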


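Given the content placement in the caches, let $p_{ij}$ denote the fraction of the traffic for which user $i$ uses the cached paths to access content $j$. We can rewrite the delay functions for the congestion-insensitive and the congestion-sensitive models as

$$D(\mathbf{p}; \mathbf{x}) = \frac{1}{\sum_i \lambda_i} \Big( \sum_{i,j} \lambda_i q_{ij} p_{ij} d_{ij}(\mathbf{x}) + \sum_{i,j} \lambda_i q_{ij} (1 - p_{ij}) d^b_i \Big)$$

and

$$D(\mathbf{p}; \mathbf{x}) = \frac{1}{\sum_i \lambda_i} \Big( \sum_{i,j} \big[ \lambda_i q_{ij} p_{ij} d_{ij}(\mathbf{x}) + \lambda_i q_{ij} (1 - p_{ij}) d^b_i \big] + \frac{\sum_{i,j} \lambda_i q_{ij} (1 - p_{ij})}{\mu - \sum_{i,j} \lambda_i q_{ij} (1 - p_{ij})} \Big),$$

respectively, where $d^b_i$ is the access delay of the uncached path for user $i$ and $\mu$ is the service rate of the back-end server. The optimal routing policy for a given content placement $\mathbf{x}$, then, is one that maximizes $-D(\mathbf{p}; \mathbf{x})$, and can be found by solving the following optimization problem:

maximize $-D(\mathbf{p}; \mathbf{x})$
such that $0 \le p_{ij} \le 1 \quad \forall i, j$     (6)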
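For the congestion-insensitive model the objective is linear in each $p_{ij}$, so (6) decouples across user-file pairs: a request uses the cached path exactly when it is faster than the uncached path. The Python sketch below illustrates this special case only. The data layout (nested lists indexed by user, file, and cache), the encoding of a placement as a set of `(file, cache)` tuples, and the function names are assumptions made here. It also assumes every user can reach every cache; an unreachable cache can be modeled with infinite delays.

```python
def cached_delay(i, j, placement, hit_delay, miss_delay):
    """d_ij(x): smallest delay over all caches for user i and file j."""
    return min(
        hit_delay[i][m] if (j, m) in placement else miss_delay[i][m]
        for m in range(len(hit_delay[i]))
    )


def optimal_routing_insensitive(placement, hit_delay, miss_delay,
                                backend_delay, num_files):
    """Return p_ij in {0, 1}: use the cached path iff it is strictly faster."""
    routing = []
    for i in range(len(backend_delay)):
        row = []
        for j in range(num_files):
            d_ij = cached_delay(i, j, placement, hit_delay, miss_delay)
            row.append(1.0 if d_ij < backend_delay[i] else 0.0)
        routing.append(row)
    return routing


def average_delay_insensitive(routing, placement, rates, popularity,
                              hit_delay, miss_delay, backend_delay):
    """Congestion-insensitive D(p; x), normalized by the total request rate."""
    total = 0.0
    for i, lam in enumerate(rates):
        for j, q in enumerate(popularity[i]):
            d_ij = cached_delay(i, j, placement, hit_delay, miss_delay)
            p = routing[i][j]
            total += lam * q * (p * d_ij + (1.0 - p) * backend_delay[i])
    return total / sum(rates)


# Hypothetical toy instance: one user, two files, one cache holding file 0.
placement = {(0, 0)}
hit_delay, miss_delay, backend_delay = [[1.0]], [[10.0]], [5.0]
rates, popularity = [2.0], [[0.5, 0.5]]
routing = optimal_routing_insensitive(
    placement, hit_delay, miss_delay, backend_delay, num_files=2
)
delay = average_delay_insensitive(
    routing, placement, rates, popularity, hit_delay, miss_delay, backend_delay
)
```

In the congestion-sensitive model the queueing term couples all the $p_{ij}$, so this per-pair rule no longer applies and a convex solver is needed instead.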


Let $\mathbf{x}_X$ be the equivalent binary representation of the content placement set $X$. We have the following lemma for both congestion-insensitive and congestion-sensitive delay models:


Lemma 4. Let $\mathcal{P}$ denote the set of all routing policies. For $X \subseteq S$, the function $F(X) = \max_{\mathbf{p} \in \mathcal{P}} -D(\mathbf{p}; \mathbf{x}_X)$ is a monotone increasing and submodular function.


Proof. See the Appendix for a detailed proof. □ 


A direct consequence of Lemma 4 is that minimizing the delay objective under either delay model is equivalent to maximizing a monotone submodular function. Therefore, the approximate solution obtained by the greedy algorithm in Algorithm 1 is within a $(1 - 1/e)$ factor of the optimal solution (see [17]).

Algorithm 1 starts with empty caches and at each step greedily adds to a cache the file that maximizes the function $F$. This process continues until all caches are filled to capacity. Optimal routing is then determined based on the content placement.
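The greedy procedure just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: placements are encoded as `(file, cache)` tuples, ties are broken by sorted order, and the objective shown is $F$ for the congestion-insensitive model only, where optimal routing reduces to taking the faster of the cached and uncached paths. All names and the toy data are hypothetical.

```python
def neg_min_delay(placement, rates, popularity, hit, miss, backend):
    """F(X) for the congestion-insensitive model: minus the average delay
    under optimal routing (each request takes its faster path)."""
    total = 0.0
    for i, lam in enumerate(rates):
        for j, q in enumerate(popularity[i]):
            d_cache = min(
                hit[i][m] if (j, m) in placement else miss[i][m]
                for m in range(len(hit[i]))
            )
            total += lam * q * min(d_cache, backend[i])
    return -total / sum(rates)


def greedy_wg(capacities, num_files, objective):
    """Add, one at a time, the element with the largest objective value."""
    candidates = {
        (j, m) for m in range(len(capacities)) for j in range(num_files)
    }
    placement = set()
    used = [0] * len(capacities)
    for _ in range(sum(capacities)):
        if not candidates:
            break
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda s: objective(placement | {s}))
        placement.add(best)
        used[best[1]] += 1
        if used[best[1]] == capacities[best[1]]:
            # The cache is full: drop all of its remaining elements.
            candidates = {s for s in candidates if s[1] != best[1]}
        else:
            candidates.discard(best)
    return placement


# Hypothetical toy instance: two users, three files, one cache of size 2.
rates = [1.0, 1.0]
popularity = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]
hit, miss, backend = [[1.0], [1.0]], [[10.0], [10.0]], [5.0, 5.0]


def objective(placement):
    return neg_min_delay(placement, rates, popularity, hit, miss, backend)


chosen = greedy_wg(capacities=[2], num_files=3, objective=objective)
```

Each step evaluates the objective once per remaining candidate, which is what makes this variant expensive when the objective itself requires solving a routing problem.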

Although the greedy algorithm in Algorithm 1 is guaranteed to find solutions within a $(1 - 1/e)$ factor of the optimal solution, its computational complexity is high. We devise a second, computationally more efficient, greedy algorithm in Algorithm 2 with time complexity $O(M^2 N K)$. We do not have accuracy guarantees for Algorithm 2, but in the next section we will show that it performs very well in practice.

Algorithm 1 GreedyWG: A greedy approximation with performance guarantees.

1: $S \leftarrow \{s_{jm} : 1 \le j \le K,\ 1 \le m \le M\}$
2: $X_m \leftarrow \emptyset,\ \forall m$
3: $X \leftarrow \emptyset$
4: for $c \leftarrow 1$ to $\sum_m C_m$ do
5:     $s_{j^* m^*} \leftarrow \arg\max_{s_{jm} \in S} F(X \cup \{s_{jm}\})$
6:     $X_{m^*} \leftarrow X_{m^*} \cup \{s_{j^* m^*}\}$
7:     $X \leftarrow X \cup \{s_{j^* m^*}\}$
8:     if $|X_{m^*}| = C_{m^*}$ then
9:         $S \leftarrow S \setminus \{s_{j m^*} : \forall j\}$
10:    else
11:        $S \leftarrow S \setminus \{s_{j^* m^*}\}$
12: Content placement is done according to $X$.
13: Determine the routing as $\mathbf{p}^* \leftarrow \arg\min_{\mathbf{p}} D(\mathbf{p}; \mathbf{x}_X)$.

Algorithm 2 is based on the following ideas. It starts with empty caches and initializes the cache access delays for users as the miss delays to their closest caches. Then, at each step, a file is greedily selected to be placed in a cache such that the change in the user access delays, $(d_{ij} - \min\{d_{ij}, d^h_{im}\})$, is maximized. This process continues until the caches are filled. Finally, similar to Algorithm 1, a routing policy that minimizes $D(\mathbf{p}; \mathbf{x})$ is determined.

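Algorithm 2 Greedy: A greedy approximation without known performance guarantees.

1: $X_m \leftarrow \emptyset,\ \forall m$
2: $X \leftarrow \emptyset$
3: $d_{ij} \leftarrow \min_m \{d^m_{im}\},\ \forall i, j$
4: for $c \leftarrow 1$ to $\sum_m C_m$ do
5:     $G \leftarrow [0]_{K \times M}$
6:     for $m \leftarrow 1$ to $M$ do
7:         if $|X_m| < C_m$ then
8:             for $j \leftarrow 1$ to $K$ do
9:                 $G_{jm} \leftarrow \sum_i \lambda_i q_{ij} (d_{ij} - \min\{d_{ij}, d^h_{im}\})$
10:    $[j^*, m^*] \leftarrow \arg\max_{j,m} G_{jm}$
11:    $X_{m^*} \leftarrow X_{m^*} \cup \{s_{j^* m^*}\}$
12:    $X \leftarrow X \cup \{s_{j^* m^*}\}$
13:    $d_{ij^*} \leftarrow \min\{d_{ij^*}, d^h_{im^*}\},\ \forall i$
14: Content placement is done according to $X$.
15: Determine the routing as $\mathbf{p}^* \leftarrow \arg\min_{\mathbf{p}} D(\mathbf{p}; \mathbf{x}_X)$.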
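The placement phase of Algorithm 2 (lines 1 to 13) can be sketched in Python as below. This is an illustrative reading of the pseudocode, not the authors' code. The nested-list data layout and all names are assumptions made here, and the sketch adds one detail the listing leaves implicit: an element already placed is skipped, so that a zero-gain tie cannot select it a second time. The final routing step is omitted.

```python
def greedy_placement(capacities, rates, popularity, hit, miss):
    """Placement phase of the lower-complexity greedy heuristic."""
    num_users = len(rates)
    num_files = len(popularity[0])
    num_caches = len(capacities)
    placement = set()
    used = [0] * num_caches
    # Line 3: every access delay starts at the user's smallest miss delay.
    delay = [[min(miss[i])] * num_files for i in range(num_users)]
    for _ in range(sum(capacities)):
        best, best_gain = None, -1.0
        for m in range(num_caches):
            if used[m] >= capacities[m]:
                continue
            for j in range(num_files):
                if (j, m) in placement:
                    continue  # assumption: never re-place an element
                # Line 9: rate-weighted reduction in access delay.
                gain = sum(
                    rates[i] * popularity[i][j]
                    * (delay[i][j] - min(delay[i][j], hit[i][m]))
                    for i in range(num_users)
                )
                if gain > best_gain:
                    best, best_gain = (j, m), gain
        if best is None:
            break
        j_star, m_star = best
        placement.add(best)
        used[m_star] += 1
        # Line 13: users now see the hit delay for the placed file.
        for i in range(num_users):
            delay[i][j_star] = min(delay[i][j_star], hit[i][m_star])
    return placement, delay


# Hypothetical toy instance: two users, three files, two unit-size caches.
rates = [1.0, 1.0]
popularity = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]
hit = [[1.0, 4.0], [4.0, 1.0]]
miss = [[10.0, 13.0], [13.0, 10.0]]
placement, delay = greedy_placement([1, 1], rates, popularity, hit, miss)
```

Unlike the first greedy algorithm, each step here needs only the current delay table, not a fresh evaluation of the routing objective.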


VII. Performance Evaluation

In this section, we evaluate the performance of the approximate algorithms. Our goal here is to evaluate 1) how well the solutions of the greedy algorithms compare to the optimal (when computing the optimal solution is feasible), and 2) how well solutions from the greedy algorithms compare to those produced by a baseline. Due to lack of space, we only consider the more realistic case of the congestion-sensitive delay model. For our baseline, we compare the approximate algorithms to the delay obtained by the following algorithm, which we will refer to as p-LRU.

A. p-LRU 

The cache replacement policy at all caches is Least Recently Used (LRU). For the routing policy, we assume that users that are not connected to any caches forward all their requests to the back-end servers. The remaining users, for each request, use a cached path with probability $p$ and with probability $1 - p$ forward the request to the uncached path. If user $i$ decides to use a cached path, she chooses uniformly at random one of the $n_i$ caches she is connected to. The value of $p$ is the same for all users that have access to a cache, and is optimized to minimize the average delay.

First, assuming users equally split their traffic across the caches that they can access, the aggregate popularity for individual files is computed at each cache. Let $r^m_j$ denote the normalized aggregate popularity of file $j$ at cache $m$. We have

$$r^m_j = \frac{1}{A} \sum_{i \in I_m} \frac{\lambda_i q_{ij}}{n_i},$$

where $I_m$ denotes the set of users connected to cache $m$, and $A$ is the normalizing constant across all files. Note that $r^m_j$ is independent of the parameter $p$. With the aggregate popularities at hand, hit probabilities are computed at each cache using the characteristic time approximation. Let $P(x_{jm} = 1)$ denote the probability that file $j$ resides in cache $m$. We have

$$P(x_{jm} = 1) = 1 - \exp(-r^m_j T_m),$$

where $T_m$, the characteristic time of cache $m$, is the unique solution to the equation

$$C_m = \sum_j \big(1 - \exp(-r^m_j T_m)\big).$$
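Since the right-hand side of the characteristic-time equation is increasing in $T_m$, the root can be found by bisection. The Python sketch below is one possible way to do this, given as an illustration. The function names, the tolerance, and the bracketing strategy are assumptions made here, and the sketch requires the capacity to be smaller than the number of files with positive popularity so that a finite root exists.

```python
import math


def characteristic_time(popularity, capacity, tol=1e-12):
    """Solve capacity = sum_j (1 - exp(-r_j * T)) for T by bisection."""
    positive = sum(1 for r in popularity if r > 0)
    if not 0 < capacity < positive:
        raise ValueError("need 0 < capacity < number of requested files")

    def expected_occupancy(t):
        return sum(1.0 - math.exp(-r * t) for r in popularity)

    lo, hi = 0.0, 1.0
    while expected_occupancy(hi) < capacity:  # grow the bracket
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if expected_occupancy(mid) < capacity:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def hit_probabilities(popularity, capacity):
    """P(x_jm = 1) = 1 - exp(-r_j * T_m) for every file j at one cache."""
    t = characteristic_time(popularity, capacity)
    return [1.0 - math.exp(-r * t) for r in popularity]


# Hypothetical normalized popularities at a single cache of size 2.
skewed = hit_probabilities([0.5, 0.3, 0.2], capacity=2)
uniform = hit_probabilities([0.25, 0.25, 0.25, 0.25], capacity=2)
```

By construction the hit probabilities sum to the cache size, and more popular files receive higher hit probabilities.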

Given the cache hit probabilities, the average delay in accessing content $j$ from caches for user $i$ equals

$$d_{ij} = \frac{1}{n_i} \sum_{m \in \mathcal{M}_i} \Big[ d^h_{im} P(x_{jm} = 1) + d^m_{im} \big(1 - P(x_{jm} = 1)\big) \Big],$$

where $\mathcal{M}_i$ denotes the set of caches that user $i$ is connected to. Note that $|\mathcal{M}_i| = n_i$.

Let $I$ denote the set of users that are connected to at least one cache, and let $\Lambda_I$ denote the aggregate request rate of these users. The average delay of accessing content from caches then equals

$$D_c = \frac{1}{\Lambda_I} \sum_{i \in I} \sum_j \lambda_i q_{ij} d_{ij}.$$

Remember that some users might not be connected to any caches. Considering the traffic from all users, we can write
the overall average delay as

$$D_{LRU} = \frac{1}{\Lambda} \Big( p \Lambda_I D_c + (1 - p) \sum_{i \in I} \lambda_i d^b_i + \sum_{i \notin I} \lambda_i d^b_i + \frac{\sum_{i \notin I} \lambda_i + (1 - p) \Lambda_I}{\mu - \sum_{i \notin I} \lambda_i - (1 - p) \Lambda_I} \Big),$$

where $\Lambda = \sum_i \lambda_i$. By differentiating $D_{LRU}$ with respect to $p$, the optimal value of $p$ is found to be

$$p^* = \max\Big\{0, \min\Big\{1,\ 1 - \Big(\mu - \sum_{i \notin I} \lambda_i - \sqrt{\frac{\mu \Lambda_I}{\Lambda_I D_c - \sum_{i \in I} \lambda_i d^b_i}}\Big) \Big/ \Lambda_I \Big\}\Big\}.$$

B. Network Setup

We consider a network with users uniformly distributed in a square field. We consider two architectures. First, we assume there is only one large cache at the center of the network, as in Figure 5(a). Second, we consider a network with five small caches with equal storage capacity, as in Figure 5(b). Figure 5 also shows the communication range of the caches in each case. In the single-cache network, the cache has a larger communication range and five times the capacity of each of the caches in the multi-cache network.

Users that are not in the communication range of any cache can only use the uncached path to the back-end server. The hit delay for each user is linearly proportional to the distance from the cache and has a maximum value* of 12.5 time units and 5.5 time units for the single-cache and multi-cache systems, respectively. For a cache miss, an additional delay of 25 time units is added to the hit delay. The initial access delay of the uncached path is set to 5 time units for each user, and the service rate is proportional to the aggregate request rate, where the scaling factor will be specified later.

*Here, delay aggregates all request propagation and download delays as well as the processing and queuing delays. We use normalized delay values instead of using any specific time unit.

Fig. 5: A network with (a) one cache, and (b) five caches.

C. Numerical Evaluation

1) GreedyWG vs. Optimal: First, we compare the solution of GreedyWG, the approximate algorithm in Algorithm 1, to the optimal solution. Due to the exponential complexity of finding the optimal solution, we are only able to compute the optimal solution for small problem instances. Here, we consider a network with five users and a single cache. User request rates are arbitrarily set to satisfy $\sum_i \lambda_i = 5$. We assume users are interested in 15 files, and that the aggregate user request popularities follow a Zipf distribution with skewness parameter 0.6. The service rate of the back-end server equals $\mu = 1$.

Figure 6 shows the average delay and the 95% confidence interval over 100 runs of each algorithm. It is clear that GreedyWG performs very close to optimal. In fact, we observe that GreedyWG differs from the optimal solution in less than 20% of the time, and the relative inaccuracy is never more than 1%.

2) GreedyWG vs. Greedy: Next, we compare the solutions of GreedyWG against those of Greedy, the approximate algorithm in Algorithm 2 with lower computational complexity but no performance guarantees. We consider a network with five caches and 100 users uniformly distributed in a 10 x 10 field.

Figure 7 shows the average delay and the 95% confidence interval for different values of the available cache budget. Greedy (red curve) is barely distinguishable from GreedyWG (black curve), meaning that Greedy performs very close to GreedyWG.

We also evaluate these algorithms over different values of the service rate at the back-end server. Figure 8 shows the average delay for $\mu$ between 2 and 7, with the aggregate traffic rate set to $\Lambda = 5$. Similar to Figure 7, Greedy performs very close to GreedyWG, and is always within 1% of GreedyWG.

Fig. 7: Evaluation of the two greedy approximations over different values of the cache budget split equally between five caches. Aggregate user request rate is $\Lambda = 5$, and service rate of the back-end server equals 2.5.

Fig. 8: Evaluation of the greedy algorithms for different values of the service rate at the back-end server (x-axis: ratio of service rate to traffic load, $\mu/\Lambda$). Aggregate user request rate is $\Lambda = 5$, and the service rate varies from 2 to 10. Cache budget is set to 125.

D. Trace-driven Simulation

Here, we present trace-driven evaluation results where we use traces for web accesses collected from an industrial research lab. The trace consists of approximately 9 million requests generated from 142,000 distinct IP addresses for more than 3 million distinct files. We only consider Greedy, the greedy algorithm presented in Algorithm 2, since it performs close to Algorithm 1 and has lower complexity.

Fig. 9: Evaluation of the Greedy and p-LRU algorithms for the single-cache (S) and multi-cache (M) network setups for different values of the available cache budget. The service rate is set to be 0.8 times the aggregate traffic rate.

Fig. 10: Evaluation of the Greedy and p-LRU algorithms for different values of the service rate to aggregate traffic ratio (x-axis: ratio of service rate to traffic load, $\mu/\Lambda$) for the single-cache (S) and multi-cache (M) network setups.

To evaluate the Greedy algorithm using the trace data, we first divide the trace into smaller segments of approximately 120,000 requests. Each segment includes requests for approximately 40,000 distinct files, generated by approximately 2500 users. For every two consecutive segments, we use the first segment as a learning dataset from which we compute the file popularities and determine the optimal value of $p$ for the p-LRU scheme as well as content placement and routing based on the Greedy algorithm. We use the second segment to compute the average delays under the p-LRU and Greedy algorithms.
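The learn-then-evaluate protocol above can be sketched as follows. This Python fragment is an illustration under assumptions: a request is represented as a hypothetical `(user, file)` tuple, consecutive segments are paired in a sliding fashion (segment 1 with 2, segment 2 with 3, and so on), and the function names are introduced here. The source does not state whether the pairs overlap.

```python
from collections import Counter


def split_into_segments(requests, segment_size):
    """Cut the trace into consecutive chunks of `segment_size` requests."""
    return [
        requests[k:k + segment_size]
        for k in range(0, len(requests), segment_size)
    ]


def train_test_pairs(segments):
    """Pair each segment (learning) with the one that follows it (testing)."""
    return list(zip(segments, segments[1:]))


def file_popularity(segment):
    """Empirical request probability of each file in a learning segment."""
    counts = Counter(file_id for _, file_id in segment)
    total = sum(counts.values())
    return {file_id: c / total for file_id, c in counts.items()}


# Hypothetical miniature trace of (user, file) requests.
trace = [("u1", "a"), ("u2", "a"), ("u1", "b"),
         ("u3", "a"), ("u2", "c"), ("u1", "a")]
segments = split_into_segments(trace, segment_size=2)
pairs = train_test_pairs(segments)
popularity = file_popularity(trace[:4])
```

Per-user popularities $q_{ij}$ would be obtained the same way after grouping each segment's requests by user.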

Figure 9 compares the average delays for different cache budgets for the p-LRU and the Greedy algorithms for the single-cache (S) and multi-cache (M) networks. Significant reductions in average delay of up to 50% are observed for both single-cache and multi-cache networks when using Greedy over p-LRU. While p-LRU yields similar performance in both single-cache and multi-cache architectures, Greedy shows the advantage of one architecture over the other depending on the cache budget. When the cache budget is small, it is better to have a single cache with larger cache size and coverage so that more users can access popular files from the cache; when the cache budget is large, it is better to have multiple caches, each with smaller size and coverage, so that users can access files from nearby caches with smaller hit delays.

We also evaluate the algorithms for different values of the service rate of the uncached path, assuming the cache budget is fixed at 10,000. Figure 10 shows the average delay when the ratio of service rate to the total request rate changes from 0.6 to 1.2. Similar to Figure 9, the Greedy algorithm significantly reduces the average content access delay. Again, the cache architecture makes little difference for p-LRU but significantly affects the performance of the Greedy algorithm. Moreover, the difference decreases as the service rate on the uncached path increases, as more traffic is offloaded to the uncached path.


VIII. Related Work 

In this paper, we have considered the joint routing and cache-content management problems. Numerous past research efforts have considered these problems separately. The problem of content placement in caches has received significant attention in the Internet, in hybrid networks such as those considered in this paper, and in sensor networks. Baev et al. [20] prove that the problem of content placement with the objective of minimizing the access delay is NP-complete, and present approximate algorithms. The separate problem of efficient routing in cache networks has also been explored in the literature [21]-[24]. Rosensweig et al. [22] propose Breadcrumbs, a simple, best-effort routing policy for locating cached content. Cache-aware routing schemes that calculate paths with minimum transportation costs based on a given caching policy and request demand have also been proposed.


The joint caching and routing problem, with the objective of minimizing content access delay, has recently been studied in two works that consider a hybrid network consisting of multiple femtocell caches and a cellular infrastructure. Both papers assume that users greedily choose the minimum delay path to access content, i.e., requests for cached content are routed to caches (where content is known to reside), whereas remaining requests are routed to the (uncached) cellular network. They assume that the delays are constant and independent of the request rate.

Our work differs from much of the previous research discussed above by considering a joint caching and routing problem, where we determine the optimal routes users should take for accessing content as well as the optimal caching policy. Our research differs from the two joint caching and routing studies above in that we consider heterogeneous delays between users and caches, consider a congestion-insensitive delay model for the uncached path as well as a congestion-sensitive model, investigate the problem's time complexity, and propose bounded approximate solutions for both congestion-insensitive and congestion-sensitive scenarios. We also determine scenarios for which the optimal solution can be found in polynomial time for the congestion-insensitive delay model, and ascertain the root cause of the complexity of the general problem.

Recent work has also theoretically analyzed the benefits of content caching [25]-[29]. Some of these works, including [29], demonstrate that the asymptotic throughput capacity of a network is significantly increased by adding caching capabilities to the nodes.


IX. Conclusion 

In this paper, we have considered the problem of joint content placement and routing in heterogeneous networks that support in-network caching but also provide a separate, single-hop (uncached) path to a back-end content server. We considered cases in which this uncached path was modeled as a congestion-insensitive, constant-delay path, and as a congestion-sensitive path modeled as an M/M/1 queue. We provided fundamental complexity results showing that the problem of joint caching and routing is NP-complete in both cases, and developed a greedy algorithm with guaranteed performance of $(1 - 1/e)$ of the optimal solution as well as a lower-complexity heuristic that was empirically found to provide average delay performance within 1% of optimal (for small instances of the problem) and that significantly reduces the average content access delay over the case of optimized traditional LRU caching. Our investigation of special-case scenarios, namely the congestion-insensitive two-cache case (where we demonstrated an optimal polynomial-time solution) and the congestion-sensitive, single-cache, single-file-of-interest case (which we demonstrated remained NP-complete), helped illuminate what makes the problem "hard" in general. Our future work is aimed at developing a distributed algorithm for content placement and routing, and at developing solutions for the case of time-varying content popularity.

Fig. 11: An example of the constraints matrix $A$ for a network with two caches, two users and three files.

Appendix A

Network with Two Caches

Proof. Consider the highlighted elements of the matrix in Figure 11 and let $r_1$ denote the first row of the matrix. Also, let $r_2$ and $r_3$ denote the first two rows below the second horizontal line. It is easy to see that if these three rows are selected to be in $R$, any assignment satisfying the Proposition should have $-s(r_1) = s(r_2) = s(r_3)$. Otherwise, the signed sum of the rows will have entries other than $\{0, \pm 1\}$. This observation can be easily extended to see that rows below the second horizontal line can be considered in groups of two such that if the two rows are selected to be in $R$ they will be assigned the same sign.

We sign the rows in $R$ starting from the rows below the second horizontal line. Considering the groups of two rows, we make assignments such that the elements to the left of the vertical line of the signed sum of the rows are in $\{0, -1\}$ only. To see why this is possible, note that the non-zero elements of the matrix to the left of the vertical line can be seen as small blocks of $2 \times 2$ matrices. It is easy to see that the signed sum of any subset of these blocks can be made to have elements only in $\{0, -1\}$, with rows in the same group getting the same assignment. The rows between the two horizontal lines are always signed $+1$. The sign of the rows above the first horizontal line follows the assignment of the rows below the second horizontal line based on the previous discussion.

With the above procedure, the sum of the signed vectors will have entries in $\{0, \pm 1\}$ for any set of rows $R$, and from the Proposition it follows that the matrix $A$ is totally unimodular, and hence the solution of the optimization problem for a network with two caches can be found in polynomial time. □
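The row-signing condition used in this proof (a characterization of total unimodularity commonly attributed to Ghouila-Houri) can be tested exhaustively on small matrices. The Python sketch below is an illustration only. The function names and the two example matrices are assumptions made here, and the matrices are not the constraints matrix of Figure 11.

```python
from itertools import combinations, product


def has_valid_signing(rows):
    """True if some +1/-1 signing of `rows` gives a sum in {0, +1, -1}."""
    num_cols = len(rows[0])
    for signs in product((1, -1), repeat=len(rows)):
        if all(
            abs(sum(s * row[c] for s, row in zip(signs, rows))) <= 1
            for c in range(num_cols)
        ):
            return True
    return False


def passes_row_signing_test(matrix):
    """Check the signing condition for every non-empty subset of rows."""
    for size in range(1, len(matrix) + 1):
        for subset in combinations(matrix, size):
            if not has_valid_signing(subset):
                return False
    return True


# Consecutive-ones rows: a standard totally unimodular example.
interval_matrix = [[1, 1, 0],
                   [0, 1, 1]]
# Odd-cycle incidence matrix: its determinant is 2, so it is not
# totally unimodular and the signing test must fail.
odd_cycle_matrix = [[1, 1, 0],
                    [0, 1, 1],
                    [1, 0, 1]]
```

The check is exponential in the number of rows, which is why the proof above argues structurally about groups of rows instead of enumerating signings.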


Appendix B 
Proof of Lemma 2 


Proof. It is easy to see that, given some $\mathbf{x}$ and $\mathbf{p}$, the expected delay $D(\mathbf{x}, \mathbf{p})$ can be computed in polynomial time, and hence CSDDP is in NP. To show that it is NP-hard, we reduce the problem of Equal Cardinality Partition (ECP) to our problem. For an instance ECP($A$) with $A = \{a_1, \ldots, a_n\}$, we create an instance of CSDDP with service rate $\mu = S$, miss delay $d^m = +\infty$, and cache capacity $n/2$, where $S = \sum_{i=1}^{n} a_i$.

Now, the set $A$ can be partitioned into subsets $A_1$ and $A_2$ with $|A_1| = |A_2|$ if and only if CSDDP achieves delay $(2n + 3)/S$.

To see more clearly why the reduction works, first note that the delay values are set such that $d^h < d^b$, so if a file exists in the cache all the requests for that file will be directed to the cache. Also, since $d^m = +\infty$, if a file is not in the cache all the requests for that file will be sent to the back-end server. Therefore, we have $p_i = x_i,\ \forall i$. Now, with the service rate set to $\mu = S$, we can rewrite the optimization problem as follows:

minimize $\frac{1}{S} (Z_1 + Z_2 - 1)$
such that $\sum_{i=1}^{n} x_i \le n/2$, $\quad x_i \in \{0, 1\}$     (7)

where $Z_1$ and $Z_2$ denote the first and second terms of the objective. Now, looking at the objective function in (7), we can see that

$$Z_1 = 4 \sum_{i=1}^{n} (1 - x_i) \ge 2n,$$

since we should have $\sum_{i=1}^{n} x_i \le n/2$. Moreover, $Z_1 = 2n$ if $\sum_{i=1}^{n} x_i = n/2$, meaning that exactly half of the files are in the cache. We also have that $Z_2 \ge 4$, and $Z_2 = 4$ only if $\sum_{i=1}^{n} a_i x_i = S/2$.

Hence, $Z_1 + Z_2 - 1 = 2n + 3$ if and only if $\sum_{i=1}^{n} a_i x_i = S/2$ and $\sum_{i=1}^{n} x_i = n/2$.

Therefore, if the CSDDP instance achieves minimum delay $(2n + 3)/S$, then $A$ can be partitioned into equal cardinality subsets.

It is easy to see that if $A$ can be partitioned into two subsets of equal cardinality, then the CSDDP instance has a minimum delay of $(2n + 3)/S$. □


Appendix C 
Proof of Lemma 4 


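Proof. Since the congestion-insensitive case can be seen as a special case of the congestion-sensitive model with $\mu = +\infty$, we give a proof for the congestion-sensitive case only.

It is easy to see that placing more content in the cache will not increase the delay, and this implies that $F$ is a monotone increasing function of $\mathbf{x}$.

Throughout this proof, $p_{ij}$ denotes the fraction of user $i$'s requests for file $j$ that is forwarded to the uncached path. Writing the Lagrangian for the routing optimization problem we get:

$$L = -D(\mathbf{p}; \mathbf{x}) + \sum_{i,j} \nu_{ij} (1 - p_{ij}) + \sum_{i,j} \sigma_{ij} p_{ij}.$$

Differentiating with respect to $p_{ij}$ we have

$$\lambda_i q_{ij} d_{ij} - \frac{\mu \lambda_i q_{ij}}{\big(\mu - \sum_{i,j} \lambda_i q_{ij} p_{ij}\big)^2} - \nu_{ij} + \sigma_{ij} = 0,$$

or equivalently

$$\frac{1}{\big(\mu - \sum_{i,j} \lambda_i q_{ij} p_{ij}\big)^2} = \frac{\lambda_i q_{ij} d_{ij} - \nu_{ij} + \sigma_{ij}}{\mu \lambda_i q_{ij}}. \qquad (8)$$

Using the K.K.T. conditions, the following are necessary:

$$\nu_{ij} (1 - p_{ij}) = 0, \quad \sigma_{ij} p_{ij} = 0, \quad p_{ij} - 1 \le 0, \quad -p_{ij} \le 0, \quad \nu_{ij} \ge 0,\ \sigma_{ij} \ge 0. \qquad (9)$$

From the above conditions, it is clear that if $p_{ij} = 1$, then $\sigma_{ij} = 0$. Also, if $p_{ij} = 0$, then $\nu_{ij} = 0$, and if $0 < p_{ij} < 1$ then $\sigma_{ij} = 0$ and $\nu_{ij} = 0$. Therefore, (8) simplifies to

$$\frac{1}{\big(\mu - \sum_{i,j} \lambda_i q_{ij} p_{ij}\big)^2} = \begin{cases} \dfrac{d_{ij}}{\mu} - \dfrac{\nu_{ij}}{\mu \lambda_i q_{ij}}, & p_{ij} = 1, \\[2mm] \dfrac{d_{ij}}{\mu}, & 0 < p_{ij} < 1, \\[2mm] \dfrac{d_{ij}}{\mu} + \dfrac{\sigma_{ij}}{\mu \lambda_i q_{ij}}, & p_{ij} = 0. \end{cases}$$

Since $\nu_{ij}, \sigma_{ij} \ge 0$, there should exist some $d^*$ such that if $d_{ij} < d^*$ then $p_{ij} = 0$, and if $d_{ij} > d^*$ then $p_{ij} = 1$. Moreover,

$$\frac{1}{\big(\mu - \sum_{i,j} \lambda_i q_{ij} p_{ij}\big)^2} = \frac{d^*}{\mu}. \qquad (10)$$

Let $X = \{x_{jm}\}$ denote the set of files in the caches. Since adding an item to the cache will not increase $d_{ij}$, we have the following lemma:

Lemma 5. For two content placement sets $X$ and $Y$, if $X \subseteq Y$ then $d^*_X \ge d^*_Y$.

Note that based on the solution structure for the traffic forwarded to the uncached path we have

$$\sum_{i,j} \lambda_i q_{ij} p_{ij} = p^* \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} = d^*\} + \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} > d^*\},$$

or equivalently

$$p^* \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} = d^*\} = \mu - \sqrt{\mu / d^*} - \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} > d^*\}. \qquad (11)$$

Using (10) and (11) we can simplify the delay function as

$$\begin{aligned} D &= \sum_{i,j} \lambda_i q_{ij} d_{ij} \mathbb{1}\{d_{ij} < d^*\} + (1 - p^*) d^* \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} = d^*\} + \sqrt{\mu d^*} - 1 \\ &= \sum_{i,j} \lambda_i q_{ij} d_{ij} \mathbb{1}\{d_{ij} < d^*\} + d^* \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} = d^*\} - d^* \Big(\mu - \sqrt{\mu / d^*} - \sum_{i,j} \lambda_i q_{ij} \mathbb{1}\{d_{ij} > d^*\}\Big) + \sqrt{\mu d^*} - 1 \\ &= -\mu d^* + 2\sqrt{\mu d^*} - 1 + \sum_{i,j} \lambda_i q_{ij} d_{ij} \mathbb{1}\{d_{ij} < d^*\} + \sum_{i,j} \lambda_i q_{ij} d^* \mathbb{1}\{d_{ij} \ge d^*\} \\ &= -\big(\sqrt{\mu d^*} - 1\big)^2 + \sum_{i,j} \lambda_i q_{ij} \min\{d^*, d_{ij}\}. \end{aligned}$$

Let $X = \{x_{jm}\}$ denote the set of files in the caches and define

$$f(d; X) = \big(\sqrt{\mu d} - 1\big)^2 - \sum_{i,j} \lambda_i q_{ij} \min\{d, d_{ij}\}. \qquad (12)$$

Let $d^X_{ij}$ and $d^Y_{ij}$ denote the cache access delay for user $i$ for file $j$ given content placement $X$ and $Y$, respectively. If $X \subseteq Y$, then $d^X_{ij} \ge d^Y_{ij}$ and hence

$$f(d; Y) - f(d; X) = \sum_{i,j} \lambda_i q_{ij} \big(\min\{d, d^X_{ij}\} - \min\{d, d^Y_{ij}\}\big) \qquad (13)$$

is a non-decreasing function of $d$. The following lemma summarizes this result.

Lemma 6. For two sets $X \subseteq Y$ and for $d_1 \le d_2$ we have

$$f(d_2; Y) - f(d_2; X) \ge f(d_1; Y) - f(d_1; X).$$

Next, we consider the function

$$-D_X = f(d^*_X; X) = \min_d f(d; X).$$

Consider two sets $X$ and $Y$ such that $X \subseteq Y$; then $d^*_X \ge d^*_Y$ due to Lemma 5. Now let $\Delta(X; x_{km}) = f(d^*_{X \cup x_{km}}; X \cup x_{km}) - f(d^*_X; X)$ and $\Delta(Y; x_{km}) = f(d^*_{Y \cup x_{km}}; Y \cup x_{km}) - f(d^*_Y; Y)$ denote the gain obtained by adding $x_{km}$ to the sets $X$ and $Y$, respectively.

From Lemma 5 we have

$$\Delta(X; x_{km}) \ge f(d^*_{Y \cup x_{km}}; X \cup x_{km}) - f(d^*_X; X),$$

and

$$\Delta(Y; x_{km}) \le f(d^*_{Y \cup x_{km}}; Y \cup x_{km}) - f(d^*_X; Y).$$

Therefore, we have

$$\Delta(X; x_{km}) - \Delta(Y; x_{km}) \ge \big[f(d^*_{Y \cup x_{km}}; X \cup x_{km}) - f(d^*_{Y \cup x_{km}}; Y \cup x_{km})\big] - \big[f(d^*_X; X) - f(d^*_X; Y)\big].$$

Moreover, from Lemma 6 we have

$$\begin{aligned} \Delta(X; x_{km}) - \Delta(Y; x_{km}) &\ge \big[f(d^*_X; X \cup x_{km}) - f(d^*_X; Y \cup x_{km})\big] - \big[f(d^*_X; X) - f(d^*_X; Y)\big] \\ &= \big[f(d^*_X; X \cup x_{km}) - f(d^*_X; X)\big] - \big[f(d^*_X; Y \cup x_{km}) - f(d^*_X; Y)\big] \\ &= \sum_{i,j} \lambda_i q_{ij} \big(\min\{d^*_X, d^X_{ij}\} - \min\{d^*_X, d^{X \cup x_{km}}_{ij}\}\big) - \sum_{i,j} \lambda_i q_{ij} \big(\min\{d^*_X, d^Y_{ij}\} - \min\{d^*_X, d^{Y \cup x_{km}}_{ij}\}\big) \\ &\ge 0, \end{aligned}$$

where the last inequality follows from the fact that $d^X_{ij} - d^{X \cup x_{km}}_{ij} \ge d^Y_{ij} - d^{Y \cup x_{km}}_{ij}$ if $X \subseteq Y$. Note that $d^{Y \cup x_{km}}_{ik} = \min\{d^Y_{ik}, d^h_{im}\}$.

From $\Delta(X; x_{km}) - \Delta(Y; x_{km}) \ge 0$ we conclude that $F(\mathbf{x})$ is submodular. □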


